Assessing People’s Skills
# Activate local environment, see `Project.toml`
import Pkg;
Pkg.activate("..");
Pkg.instantiate();
This demo shows how to use the @node and @rule macros, which allow the user to define custom factor nodes and their associated update rules, respectively. We introduce these macros in the context of a root cause analysis of a student's test results. This demo is inspired by Chapter 2 of "Model-Based Machine Learning" by Winn et al.
Problem Statement
We consider a student who takes a test that consists of three questions. Answering each question correctly requires a combination of skill and attitude. More precisely, the outcome depends on whether the student has studied for the test and whether they partied the night before.
We model the result for question $i$ as a continuous variable $r_i\in[0,1]$, and skill/attitude as a binary variable $s_i \in \{0, 1\}$, where $s_1$ represents whether the student has partied, and $s_2$ and $s_3$ represent whether the student has studied the chapters for the corresponding questions.
We assume the following logic:
- If the student is alert (has not partied), then they will score on the first question;
- If the student is alert or has studied chapter two, then they will score on question two;
- If the student can answer question two and has studied chapter three, then they will score on question three.
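The three rules above can be sketched in plain Julia, independent of RxInfer. The helper name `can_answer` is hypothetical and serves only as an illustration; it maps the three binary skill/attitude variables to whether each question can be answered.

```julia
# Hypothetical helper (not part of RxInfer): encodes the three rules above.
# s1 = partied, s2 = studied chapter two, s3 = studied chapter three.
function can_answer(s1::Bool, s2::Bool, s3::Bool)
    q1 = !s1          # an alert student answers question one
    q2 = q1 || s2     # alert, or studied chapter two
    q3 = q2 && s3     # can answer question two and studied chapter three
    return (q1, q2, q3)
end

# Print the truth table over all eight skill/attitude combinations
for s1 in (false, true), s2 in (false, true), s3 in (false, true)
    println((s1, s2, s3), " => ", can_answer(s1, s2, s3))
end
```

Note that a student who partied and skipped chapter two cannot answer any question, regardless of chapter three.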
Generative Model Definition
To model the probability for correct answers, we assume a latent state variable $t_i \in \{0,1\}$. The dependencies between the variables can then be modeled by the following Bayesian network:
(s_1)   (s_2)   (s_3)
  |       |       |
  v       v       v
(t_1)-->(t_2)-->(t_3)
  |       |       |
  v       v       v
(r_1)   (r_2)   (r_3)
As prior beliefs, we assume that a student is equally likely to study/party or not: $s_i \sim Ber(0.5)\,,$ for all $i$. Next, we model the domain logic as $\begin{aligned} t_1 &= \neg s_1\\ t_2 &= t_1 \lor s_2\\ t_3 &= t_2 \land s_3\,. \end{aligned}$ For the scoring results we might not have a specific forward model in mind. However, we can define a backward mapping, from continuous results to discrete latent variables, as $t_i \sim Ber(r_i)\,,$ for all $i$.
Custom Nodes and Rules
The backward mapping from results to latents is quite specific to our application. Moreover, it does not define a proper generative forward model. To still obtain a full generative model for our application, we can define a custom Score node and an update rule that implements the backward mapping from scores to latents as a message.
In RxInfer, the @node macro defines a factor node. This macro accepts the new node type, an indicator for a stochastic or deterministic relationship, and a list of interfaces.
using RxInfer, Random
# Create Score node
struct Score end
@node Score Stochastic [out, in]
We can now define the backward mapping as a sum-product message through the @rule macro. This macro accepts the node type, the (outbound) interface on which the message is sent, any relevant constraints, and the message/distribution types on the remaining (inbound) interfaces.
# Adding update rule for the Score node
@rule Score(:in, Marginalisation) (q_out::PointMass,) = begin
    return Bernoulli(mean(q_out))
end
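To see what this rule produces numerically, the following plain-Julia sketch evaluates the Bernoulli message for a given score without using RxInfer. The function name `score_message_weights` is hypothetical; it returns the weights that a Bernoulli message with parameter $r_i$ assigns to $t_i = 0$ and $t_i = 1$.

```julia
# Hypothetical helper (not part of RxInfer):
# weights of a Bernoulli(r) message over t = 0 and t = 1.
function score_message_weights(r::Real)
    0 <= r <= 1 || throw(ArgumentError("score must lie in [0, 1]"))
    return (1 - r, r)   # (weight for t = 0, weight for t = 1)
end

w0, w1 = score_message_weights(0.1)
println("t = 0: ", w0, ", t = 1: ", w1)
```

A low score such as 0.1 therefore favours $t_i = 0$ over $t_i = 1$ by a factor of nine.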
Generative Model Specification
We can now build the full generative model.
# GraphPPL.jl exports the `@model` macro for model specification
# It accepts a regular Julia function and builds an FFG under the hood
@model function skill_model(r)
    local s
    # Priors
    for i in eachindex(r)
        s[i] ~ Bernoulli(0.5)
    end
    # Domain logic
    t[1] ~ ¬s[1]
    t[2] ~ t[1] || s[2]
    t[3] ~ t[2] && s[3]
    # Results
    for i in eachindex(r)
        r[i] ~ Score(t[i])
    end
end
Inference Specification
Let us assume that a student scored very low on all three questions. We pass these scores as data and run inference.
test_results = [0.1, 0.1, 0.1]
inference_result = infer(
    model=skill_model(),
    data=(r=test_results,)
)
Inference results:
Posteriors | available for (s, t)
Results
# Inspect the results
map(params, inference_result.posteriors[:s])
3-element Vector{Tuple{Float64}}:
(0.9872448979591837,)
(0.06377551020408162,)
(0.4719387755102041,)
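These results suggest that this particular student was very likely out on the town last night (posterior probability of about 0.99 for $s_1$) and most likely did not study chapter two (about 0.06 for $s_2$). The posterior for $s_3$ (about 0.47) stays close to the prior of 0.5, so the low scores say little about whether chapter three was studied.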
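Because the model has only eight joint configurations of $s$, the posteriors above can be cross-checked by brute-force enumeration. The sketch below is plain Julia and does not use RxInfer. It assumes that the Score rule contributes a weight of $r_i$ when $t_i = 1$ and $1 - r_i$ when $t_i = 0$; the function name `enumerate_posterior` is hypothetical.

```julia
# Brute-force cross-check (hypothetical helper, not part of RxInfer).
# Assumption: each score r[i] weights t[i] = 1 by r[i] and t[i] = 0 by 1 - r[i].
function enumerate_posterior(r::AbstractVector{<:Real})
    length(r) == 3 || throw(ArgumentError("expected three scores"))
    total = 0.0
    marginals = zeros(3)   # accumulates unnormalised P(s[i] = 1 | r)
    for s1 in (false, true), s2 in (false, true), s3 in (false, true)
        # Deterministic domain logic
        t1 = !s1
        t2 = t1 || s2
        t3 = t2 && s3
        # The uniform Bernoulli(0.5) priors on s cancel in the normalisation
        w = 1.0
        for (ti, ri) in zip((t1, t2, t3), r)
            w *= ti ? ri : 1 - ri
        end
        total += w
        marginals .+= w .* [s1, s2, s3]
    end
    return marginals ./ total
end

posterior = enumerate_posterior([0.1, 0.1, 0.1])
println(posterior)
```

Under this assumption the enumeration reproduces the three posterior means reported above (approximately 0.987, 0.064, and 0.472).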